AITopics | acceptance ratio

Virtual Network Embedding (VNE) is a key enabler of network slicing, yet most formulations assume that each Virtual Network Request (VNR) has a fixed topology. Recently, VNE with Alternative topologies (VNEAP) was introduced to capture malleable VNRs, where each request can be instantiated using one of several functionally equivalent topologies that trade resources differently. While this flexibility enlarges the feasible space, it also introduces an additional decision layer, making dynamic embedding more challenging. This paper proposes HRL-VNEAP, a hierarchical reinforcement learning approach for VNEAP under dynamic arrivals. A high-level policy selects the most suitable alternative topology (or rejects the request), and a low-level policy embeds the chosen topology onto the substrate network. Experiments on realistic substrate topologies under multiple traffic loads show that naive exploitation strategies provide only modest gains, whereas HRL-VNEAP consistently achieves the best performance across all metrics. Compared to the strongest tested baselines, HRL-VNEAP improves acceptance ratio by up to \textbf{20.7\%}, total revenue by up to \textbf{36.2\%}, and revenue-over-cost by up to \textbf{22.1\%}. Finally, we benchmark against an MILP formulation on tractable instances to quantify the remaining gap to optimality and motivate future work on learning- and optimization-based VNEAP solutions.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2512.05207

Country: Europe (0.46)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

SpecMER: Fast Protein Generation with K-mer Guided Speculative Decoding

Walton, Thomas, Tsui, Darin, Musharaf, Aryan, Aghazadeh, Amirali

arXiv.org Artificial IntelligenceSep-29-2025

Autoregressive models have transformed protein engineering by enabling the generation of novel protein sequences beyond those found in nature. However, their sequential inference introduces significant latency, limiting their utility in high-throughput protein screening. Speculative decoding accelerates generation by employing a lightweight draft model to sample tokens, which a larger target model then verifies and refines. Yet, in protein sequence generation, draft models are typically agnostic to the structural and functional constraints of the target protein, leading to biologically implausible outputs and a shift in the likelihood distribution of generated sequences. We introduce SpecMER (Speculative Decoding via k-mer Guidance), a novel framework that incorporates biological, structural, and functional priors using k-mer motifs extracted from multiple sequence alignments. By scoring candidate sequences in parallel and selecting those most consistent with known biological patterns, SpecMER significantly improves sequence plausibility while retaining the efficiency of speculative decoding. SpecMER achieves 24-32% speedup over standard autoregressive decoding, along with higher acceptance rates and improved sequence likelihoods.

bioinformatics, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2509.21689

Country:

North America > United States (0.28)
Europe > France (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Biomedical Informatics (0.66)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

8a0fd48510590071e3c129a79b8b8527-Supplemental-Conference.pdf

Neural Information Processing SystemsAug-16-2025, 18:42:42 GMT

artificial intelligence, machine learning, neuron, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Natick (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

9e7ba617ad9e69b39bd0c29335b79629-Supplemental.pdf

Neural Information Processing SystemsAug-16-2025, 10:39:44 GMT

artificial intelligence, machine learning, target distribution, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

Add feedback

67d60c2694f4fecd18fa04d1fa8c0a5c-Supplemental-Conference.pdf

Neural Information Processing SystemsAug-15-2025, 12:54:43 GMT

clip-score 0, hq-transformer, image generation, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.35)

Add feedback

Optimizing Blood Transfusions and Predicting Shortages in Resource-Constrained Areas

Belfarsi, El Arbi, Brubaker, Sophie, Valero, Maria

arXiv.org Artificial IntelligenceJun-18-2025

Our research addresses the critical challenge of managing blood transfusions and optimizing allocation in resource-constrained regions. We present heuristic matching algorithms for donor-patient and blood bank selection, alongside machine learning methods to analyze blood transfusion acceptance data and predict potential shortages. We developed simulations to optimize blood bank operations, progressing from random allocation to a system incorporating proximity-based selection, blood type compatibility, expiration prioritization, and rarity scores. Moving from blind matching to a heuristic-based approach yielded a 28.6% marginal improvement in blood request acceptance, while a multi-level heuristic matching resulted in a 47.6% improvement. For shortage prediction, we compared Long Short-Term Memory (LSTM) networks, Linear Regression, and AutoRegressive Integrated Moving Average (ARIMA) models, trained on 170 days of historical data. Linear Regression slightly outperformed others with a 1.40% average absolute percentage difference in predictions. Our solution leverages a Cassandra NoSQL database, integrating heuristic optimization and shortage prediction to proactively manage blood resources. This scalable approach, designed for resource-constrained environments, considers factors such as proximity, blood type compatibility, inventory expiration, and rarity. Future developments will incorporate real-world data and additional variables to improve prediction accuracy and optimization performance.

artificial intelligence, blood bank, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.5220/0013182700003911

2506.12647

Country:

Asia (0.46)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.26)
North America > United States (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Hematology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

An Efficient Private GPT Never Autoregressively Decodes

Li, Zhengyi, Guan, Yue, Yang, Kang, Feng, Yu, Liu, Ning, Yu, Yu, Leng, Jingwen, Guo, Minyi

arXiv.org Artificial IntelligenceMay-22-2025

The wide deployment of the generative pre-trained transformer (GPT) has raised privacy concerns for both clients and servers. While cryptographic primitives can be employed for secure GPT inference to protect the privacy of both parties, they introduce considerable performance overhead.To accelerate secure inference, this study proposes a public decoding and secure verification approach that utilizes public GPT models, motivated by the observation that securely decoding one and multiple tokens takes a similar latency. The client uses the public model to generate a set of tokens, which are then securely verified by the private model for acceptance. The efficiency of our approach depends on the acceptance ratio of tokens proposed by the public model, which we improve from two aspects: (1) a private sampling protocol optimized for cryptographic primitives and (2) model alignment using knowledge distillation. Our approach improves the efficiency of secure decoding while maintaining the same level of privacy and generation quality as standard secure decoding. Experiments demonstrate a $2.1\times \sim 6.0\times$ speedup compared to standard decoding across three pairs of public-private models and different network conditions.

large language model, machine learning, private model, (20 more...)

arXiv.org Artificial Intelligence

2505.15252

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
North America > Canada (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)

Genre: Research Report (0.81)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

ML-SpecQD: Multi-Level Speculative Decoding with Quantized Drafts

Georganas, Evangelos, Kalamkar, Dhiraj, Kozlov, Alexander, Heinecke, Alexander

arXiv.org Artificial IntelligenceMar-17-2025

Speculative decoding (SD) has emerged as a method to accelerate LLM inference without sacrificing any accuracy over the 16-bit model inference. In a typical SD setup, the idea is to use a full-precision, small, fast model as "draft" to generate the next few tokens and use the "target" large model to verify the draft-generated tokens. The efficacy of this method heavily relies on the acceptance ratio of the draft-generated tokens and the relative token throughput of the draft versus the target model. Nevertheless, an efficient SD pipeline requires pre-training and aligning the draft model to the target model, making it impractical for LLM inference in a plug-and-play fashion. In this work, we propose using MXFP4 models as drafts in a plug-and-play fashion since the MXFP4 Weight-Only-Quantization (WOQ) merely direct-casts the BF16 target model weights to MXFP4. In practice, our plug-and-play solution gives speedups up to 2x over the BF16 baseline. Then we pursue an opportunity for further acceleration: the MXFP4 draft token generation itself can be accelerated via speculative decoding by using yet another smaller draft. We call our method ML-SpecQD: Multi-Level Speculative Decoding with Quantized Drafts since it recursively applies speculation for accelerating the draft-token generation. Combining Multi-Level Speculative Decoding with MXFP4 Quantized Drafts we outperform state-of-the-art speculative decoding, yielding speedups up to 2.72x over the BF16 baseline.

large language model, machine learning, target model, (18 more...)

arXiv.org Artificial Intelligence

2503.13565

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Technology: